Naïve Bayes vs. Decision Trees vs. Neural Networks in the Classification of Training Web Pages

Authors

  • Daniela XHEMALI
  • Christopher J. HINDE
  • Roger G. STONE
Abstract

Web classification has been attempted through many different technologies. In this study we concentrate on the comparison of Neural Network (NN), Naïve Bayes (NB) and Decision Tree (DT) classifiers for the automatic analysis and classification of attribute data from training course web pages. We introduce an enhanced NB classifier and run the same data sample through the DT and NN classifiers to determine the success rate of our classifier in the training courses domain. This research shows that our enhanced NB classifier not only outperforms the traditional NB classifier, but also performs as well as, if not better than, some more popular rival techniques. This paper also shows that, overall, our NB classifier is the best choice for the training courses domain, achieving an impressive F-Measure value of over 97%, despite being trained with fewer samples than any of the classification systems we have encountered.
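The F-Measure quoted in the abstract is commonly defined as the weighted harmonic mean of precision and recall. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `f_measure` and the counts used in the example are hypothetical and are not taken from the paper.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Standard F-measure from true positive, false positive and false negative counts.

    beta = 1.0 gives the balanced F1 score (harmonic mean of precision and recall).
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only; these are NOT results from the paper.
score = f_measure(tp=97, fp=3, fn=2)
print(f"F-measure: {score:.3f}")
```

With `beta = 1.0` the expression reduces to `2 * tp / (2 * tp + fp + fn)`, so a classifier reaches a high score only when both false positives and false negatives are rare.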

Similar resources

Catégorisation automatique de pages web chinoises - documents spécialisés vs grand public sur le tabagisme

Text categorization (or supervised classification) generally addresses the topic or the type of a text. We tackle here a different dimension, the intended audience, contrasting two broad categories: texts intended for the general public, or texts intended for specialists. We test the categorization, according to this contrast, of Chinese Web pages about smoking. In this context, we obtain the f...

Is Naïve Bayes a Good Classifier for Document Classification?

Document classification is of growing interest in text mining research. Correctly assigning documents to a particular category remains a challenge because of the large number of features in the dataset. With regard to existing classification approaches, Naïve Bayes is potentially a good document classification model due to its simplicity. The aim of this...

A New Method to Improve Automated Classification of Heart Sound Signals: Filter Bank Learning in Convolutional Neural Networks

Introduction: Recent studies have acknowledged the potential of convolutional neural networks (CNNs) in distinguishing healthy and morbid samples by using heart sound analyses. Unfortunately, the performance of CNNs is highly dependent on the filtering procedure applied to the signal in their convolutional layer. The present study aimed to address this problem by a...

Study of Trend-Stuffing on Twitter through Text Classification

Twitter has become an important mechanism for users to keep up with friends as well as the latest popular topics, reaching over 20 million unique visitors monthly and generating over 1.2 billion tweets a month. To make popular topics easily accessible, Twitter lists the current most tweeted topics on its homepage as well as on most user pages. This provides a one-click shortcut to tweets relate...


Publication date: 2009